Does Data Sharing in Consent Actually Reduce Participation?

Examining Data Collection Type and Concern Resolution



Jürgen, Salome, Tina, Johannes, Talha

17 September 2025

Evidence so far





Research Questions



  1. Do participants read informed consent forms?

  2. Does the information on sharing data influence the participants’ willingness to participate?
    moderated by data type?

  3. Does the option to clarify concerns about the informed consent make a difference in the willingness to participate?

Study 1




Do participants read
informed consent forms?

Study 1

Design

  • Re-use of log data from past online surveys that
    • measured dwell time on the consent page, and
    • allow re-use of their data according to the informed consent

  • Sample
    • [Teachers, students, …]
    • \(N_{participants} = ...\) participants
    • from \(N_{studies} = ...\) online surveys ([year begin - year end])

Study 1

Measures

Dwell time on consent: Reading speed (as words per minute):

  • 238: average (silent) reading speed (English) (Brysbaert, 2019)
  • > 500-600: loss of information/comprehension (skimming)
  • > 900: information falls out of working memory before being comprehended (Masson, 1983)
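
As a minimal sketch of how these thresholds could be applied, the following R snippet converts dwell time into words per minute and labels each case. The function name `classify_reading_speed`, its arguments, and the use of 500 as the skimming cut-off (the lower end of the 500-600 range) are assumptions for illustration, not part of the planned analysis.

```r
# Hypothetical helper: convert dwell time to words per minute and classify it.
# Cut-offs follow the slide: > 500 wpm = skimming (lower bound of the 500-600
# range, an assumption), > 900 wpm = beyond the reading limit (Masson, 1983).
classify_reading_speed <- function(n_words, dwell_seconds) {
  stopifnot(all(dwell_seconds > 0))
  wpm <- n_words / (dwell_seconds / 60)
  category <- cut(
    wpm,
    breaks = c(0, 500, 900, Inf),
    labels = c("plausible reading", "skimming", "beyond reading limit"),
    right = TRUE
  )
  data.frame(words_per_minute = wpm, category = category)
}

# Example: a 600-word consent form viewed for 150, 60, and 20 seconds
res <- classify_reading_speed(n_words = 600, dwell_seconds = c(150, 60, 20))
print(res)
```

In this example the three dwell times correspond to 240, 600, and 1800 words per minute, so only the first case is compatible with actually reading the form.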

Additional coding

  • Which population? [students, teachers, public, researchers, …]
  • Participants incentivized? How? [none, money, …]
  • Topic of study? [sensitive, somewhat sensitive, not sensitive]

Study 1

Statistical analysis

library(brms)   # brm(), posterior_epred(), hypothesis()
library(dplyr)  # mutate(), case_when(), %>%

# Center reading speed on the language-specific average silent reading speed
data <- data %>%
  mutate(words_per_minute_centered = case_when(
    language == "German"  ~ words_per_minute - 260,
    language == "English" ~ words_per_minute - 238
  ))

############################################################################## #
## With random intercept                                                    ####
############################################################################## #

# Gamma model with log link, intercept only, random intercept per study
fit_gamma_RE <- brm(
  words_per_minute ~ 1 + (1 | study),
  data = data,
  family = Gamma(link = "log"),
  chains = 4, iter = 4000, cores = 4, seed = 123
)

# --- get posterior mean and CI ---
# Posterior: expected population mean (on the original scale)
newdata <- data.frame(study = NA)   # NA in the grouping variable => new level (re_formula = NA fixes the random effects at 0)
epred <- posterior_epred(fit_gamma_RE,
                         newdata = newdata, # predict these cases
                         re_formula = NA)  # ignore random effects, gets population average (only fixed effects)
epred_vec <- as.numeric(epred[, 1])        # get vector with posterior draws

mean(epred_vec)                       # Posterior mean of population expected reading speed
quantile(epred_vec, c(0.025, 0.975))  # 95% CI

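# --- hypothesis() on log scale ---
# The Gamma model is fitted to raw (uncentered) words_per_minute, because a
# Gamma outcome must be positive. The Intercept is therefore the log of the
# expected reading speed, and the thresholds are logged on the raw scale.
# Words per minute for skim reading and the reading limit aren't clearly
# defined per language. We therefore use the English-language values
# (and not both languages separately).
log_238 <- log(238) # average silent reading speed
log_500 <- log(500) # skim reading
log_900 <- log(900) # reading limit

hypothesis(fit_gamma_RE, paste0("Intercept > ", log_238))
hypothesis(fit_gamma_RE, paste0("Intercept > ", log_500))
hypothesis(fit_gamma_RE, paste0("Intercept > ", log_900))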





############################################################################## #
## Without random intercept                                                 ####
############################################################################## #
library(brms)

# Gamma model with log link, intercept only
fit_gamma_noRE <- brm(
  words_per_minute ~ 1,
  data = data,   # same data set as in the random-intercept model
  family = Gamma(link = "log"),
  chains = 4, iter = 4000, cores = 4, seed = 123
)


# --- get posterior mean and CI ---
epred <- posterior_epred(fit_gamma_noRE)
epred_vec <- as.numeric(epred[, 1])   # intercept-only model: every column holds the same draws

mean(epred_vec)                       # Posterior mean
quantile(epred_vec, c(0.025, 0.975))  # 95% CI

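# --- hypothesis() on log scale ---
# The Gamma model is fitted to raw (uncentered) words_per_minute, so the
# Intercept is the log of the expected reading speed and the thresholds are
# logged on the raw scale. Words per minute for skim reading and the reading
# limit aren't clearly defined per language. We therefore use the
# English-language values (and not both languages separately).
log_238 <- log(238) # average silent reading speed
log_500 <- log(500) # skim reading
log_900 <- log(900) # reading limit

hypothesis(fit_gamma_noRE, paste0("Intercept > ", log_238))
hypothesis(fit_gamma_noRE, paste0("Intercept > ", log_500))
hypothesis(fit_gamma_noRE, paste0("Intercept > ", log_900))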

Study 2a

Conceptual Replication (and expansion) of Study 1



Do participants read
informed consent forms?

replication part
expansion part

Study 2a

Conceptual Replication (and expansion) of Study 1

Do participants read consent forms?
(spend enough time on informed consent pages in online surveys to plausibly read and process the information presented?)

  • Procedure: Presenting the actual consent, then if participants agree

  • Measures:
    Dwell time
    Self-reported reading engagement
    Understanding (e.g., via multiple-choice)

Study 2a

Statistical analysis

Study 2a

Conceptual Replication (and expansion) of Study 1

Who reads consent forms?
What individual characteristics predict whether participants read and understand informed consent forms?

  • Procedure: Presenting the actual consent, then if participants agree

  • Measures:
    Dwell time
    Self-reported reading engagement
    Understanding (e.g., via multiple-choice)

  • Predictors
    Perceived Trust
    Perceived Risk of Participation
    Experience as Research Participant
    Experience Conducting Studies

Study 2a

Conceptual Replication (and expansion) of Study 1

library(brms)

# Multivariate model: four predictors -> dwell time and reading engagement -> understanding
fit_mediation <- brm(
  data = mydata,
  formula =
    bf(dwell_time ~ experience_participant + experience_conducting + trust_benevolence + perceived_risk,
       family = Gamma(link = "log")) +  # log link as data are surely skewed
    bf(reading_engagement ~ experience_participant + experience_conducting + trust_benevolence + perceived_risk,
       family = student()) +
    bf(understanding ~ experience_participant + experience_conducting + trust_benevolence + perceived_risk +
         dwell_time + reading_engagement,
       family = student()) +
    # residual correlations can only be estimated if all responses are gaussian
    # or all are student; with a Gamma response they have to be switched off
    set_rescor(FALSE),
  chains = 4, iter = 4000, cores = 4, control = list(adapt_delta = 0.95)
)

posterior <- as_draws_df(fit_mediation)

# Indirect effects over dwell time ############################################
indirect_dwell <- posterior$b_dwelltime_experience_participant * posterior$b_understanding_dwell_time +
                  posterior$b_dwelltime_experience_conducting * posterior$b_understanding_dwell_time +
                  posterior$b_dwelltime_trust_benevolence * posterior$b_understanding_dwell_time +
                  posterior$b_dwelltime_perceived_risk * posterior$b_understanding_dwell_time

# Credible Interval + Posterior mean
quantile(indirect_dwell, c(0.025, 0.975))
mean(indirect_dwell)


# Indirect effects over reading engagement #####################################
indirect_engagement <- posterior$b_readingengagement_experience_participant * posterior$b_understanding_reading_engagement +
                  posterior$b_readingengagement_experience_conducting * posterior$b_understanding_reading_engagement +
                  posterior$b_readingengagement_trust_benevolence * posterior$b_understanding_reading_engagement +
                  posterior$b_readingengagement_perceived_risk * posterior$b_understanding_reading_engagement

# Credible Interval + Posterior mean
quantile(indirect_engagement, c(0.025, 0.975))
mean(indirect_engagement)


# total indirect effects #######################################################
indirect_total <- indirect_dwell + indirect_engagement

# Credible Interval + Posterior mean
quantile(indirect_total, c(0.025, 0.975))
mean(indirect_total)
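
The code above sums the products over all four predictors. If indirect effects per predictor are of interest instead, they could be summarized one path at a time. The following is only a sketch: the posterior draws are simulated stand-ins (the values are made up), the helper `summarize_indirect` is hypothetical, and only two of the four predictors are shown. Because dwell time is modeled with a log link, the first path is on the log scale, so the products should be interpreted with caution.

```r
# Simulated stand-in for as_draws_df(fit_mediation); values are made up
set.seed(1)
n_draws <- 4000
posterior <- data.frame(
  b_dwelltime_experience_participant = rnorm(n_draws, 0.10, 0.05),
  b_dwelltime_perceived_risk         = rnorm(n_draws, 0.20, 0.05),
  b_understanding_dwell_time         = rnorm(n_draws, 0.30, 0.05)
)

# Summarize one indirect path a * b: posterior mean and 95% credible interval
summarize_indirect <- function(a, b) {
  ab <- a * b
  c(mean = mean(ab), quantile(ab, c(0.025, 0.975)))
}

# One row per predictor, columns: mean, 2.5%, 97.5%
predictors <- c("experience_participant", "perceived_risk")
indirect_by_predictor <- t(sapply(predictors, function(p) {
  summarize_indirect(posterior[[paste0("b_dwelltime_", p)]],
                     posterior$b_understanding_dwell_time)
}))
indirect_by_predictor
```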

Study 2b

Effects of data sharing and data type




Does the information on sharing data influence the participants’ willingness to participate?

(for different data types?)

Study 2b

Effects of data sharing and data type

  • Procedure:
    1. Description & consent of fictitious study presented (see on the right)
    2. then questionnaire

  • Design: 2x3 (within?-)between-subjects design

  • Factors:
    • Data-sharing section (included vs. not included)
    • data collection type (video recording vs. interview vs. survey)

  • Randomized:
    • topic of study (varying degrees of sensitivity)
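
A minimal sketch of how the random assignment could look, assuming a fully between-subjects version of the 2x3 design. The sample size, the object names, and the three topic levels are assumptions for illustration only.

```r
# Hypothetical sketch of random assignment for the 2x3 between-subjects design
set.seed(123)  # for a reproducible illustration only

n <- 12  # assumed number of participants

# The six cells of the design
conditions <- expand.grid(
  data_sharing = c("included", "not included"),
  data_type    = c("video recording", "interview", "survey"),
  stringsAsFactors = FALSE
)

# Balanced assignment: repeat the six cells, then shuffle the order
cell <- sample(rep(seq_len(nrow(conditions)), length.out = n))
assignment <- conditions[cell, ]
rownames(assignment) <- NULL
assignment$participant <- seq_len(n)

# Topic is randomized independently of the experimental factors
topics <- c("sensitive", "somewhat sensitive", "not sensitive")
assignment$topic <- sample(topics, n, replace = TRUE)

# Check the cell sizes
table(assignment$data_sharing, assignment$data_type)
```

Shuffling a balanced vector of cell indices keeps the cell sizes equal, whereas independent draws per participant would leave them to chance.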


Study 2b

Effects of data sharing and data type

  • Measures:
    • Willingness to participate (6-point Likert scale)

  • Treatment checks
    • perceived sensitivity of topic of study
    • Dwell time
    • Self-reported reading engagement
    • Understanding


Study 2b

Effects of data sharing and data type

Structural model only (without measurement models)
perceived_sensitivity as control

Study 2b

Effects of data sharing and data type

perceived_sensitivity and its interaction as control

library(brms)

fit <- brm(
  formula = bf(willingness_participate ~ data_sharing * data_type_interview + 
                                         data_sharing * data_type_video +
                                         data_sharing * perceived_sensitivity),
  data = mydata,
  family = gaussian(),
  chains = 4,
  cores = 4,
  iter = 4000,
  warmup = 1000,
  control = list(adapt_delta = 0.95))
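
The formula uses `data_type_interview` and `data_type_video`, which suggests dummy variables with the survey condition as reference category. A sketch of how they could be derived is shown below; the column `data_type`, its labels, and the example rows are assumptions.

```r
# Hypothetical data: one row per participant
mydata <- data.frame(
  data_type = c("survey", "interview", "video", "survey", "video")
)

# Dummy-code data type with "survey" as the (assumed) reference category
mydata$data_type_interview <- as.integer(mydata$data_type == "interview")
mydata$data_type_video     <- as.integer(mydata$data_type == "video")

mydata
```

With this coding, the coefficient of `data_sharing` describes the effect of the data-sharing section in the survey condition, and the interaction terms describe how that effect differs for interviews and video recordings.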

Study 2c




Does the option to clarify concerns about the informed consent make a difference in the willingness to participate?

Study 2c

Effects of information on concerns

Pilot Study

  • Show consent
  • Open question: “What would be your greatest concern?”
  • Open question: “What would dispel these concerns?”

Analysis

  • Synthesize topics in concerns
  • Evaluate whether participants want informational or emotional response

Study 2c

Effects of information on concerns

  • Procedure: Reading consent & T1 from Study 2b
    • voice their greatest concern with participating in the study (open/ closed format?)
    • treatment: information on concern (yes vs. no)
    • measurements: willingness to participate
  • Design: pre-post between-subjects RCT
  • Factor (treatment): Information on concern [yes vs. no]
    • EG: information on voiced concern displayed
    • CG: neutral information displayed (see on the right)
  • Measures (pre-post)
    • willingness to participate
    • Perceptions of being well-informed about study
    • trust in research?


“Which of these potential concerns about the study do you consider the most significant?” t1p.de/concern-response


“Thank you for sharing your thoughts with us. We know that participants have different perspectives on this topic. So far, there has been little research on the concerns people have about participating in scientific studies. With this study, we would like to contribute to closing this research gap.”

Study 2c

Effects of information on concerns

| Group | Read informed consent | Measure willingness to participate | Voice greatest concern | Measure willingness to participate | Treatment | Measure willingness to participate |
|---|---|---|---|---|---|---|
| Experimental Group (EG) | ✔️ | ✔️ | ✔️ | ✔️ | Information on concern: Concern-specific feedback | ✔️ |
| Control Group (CG) | ✔️ | ✔️ | ✔️ | ✔️ | Neutral information | ✔️ |

Study 2c

Concerns and responses

Concern: Where and how exactly will my data be stored? How can I be sure my personal data will remain secure and confidential?
Response: Your personal data will be securely stored on protected servers at a specialized research data center that complies with GDPR standards. Only authorized researchers will have access, following strict data protection protocols to prevent unauthorized use or breaches. Data will be retained for at least 10 years in accordance with research integrity guidelines.

Concern: Will others be able to identify me and relate my data with my person?
Response: Whenever possible, your data will be anonymized so that no one can identify you. In cases where anonymization is not feasible (e.g., video recordings or interviews), your data will be securely stored in a specialized research data center. Only authorized researchers who submit a request, transparently detailing their intended use of the data, will be granted access. No other individuals will be able to view or use your data.

Concern: What does data sharing mean? Who will have access to my data in the future, and how will it be used?
Response: If you consent, your data may be stored in a specialized research data center. Researchers who wish to access the data must submit an application detailing their research plan transparently. Only approved researchers will be granted selective access, and they are required to delete the data after completing their analyses. They are not allowed to further distribute the data. Your data will not be shared with commercial entities or used for purposes unrelated to research.

Concern: If I withdraw my consent, what happens to the data that has already been collected? What if my data has already been analyzed or published?
Response: You can withdraw your consent at any time by contacting the research team. After withdrawal, we will ensure that your data is no longer used in the study. This means that any data that can be linked to you will be fully deleted. However, results that have already been published in scientific research reports cannot be removed. In these publications, no one will be able to identify you.

Concern: Could my data be used in ways that I haven’t agreed to or that could harm me?
Response: Your data will only be used for research purposes that align with ethical and legal standards. It will not be shared with third parties for commercial or unauthorized use. Any reuse of your data by other researchers must comply with strict ethical guidelines to prevent misuse. Researchers who wish to access your data must submit a transparent plan detailing their intended use and sign a contract binding them to this specific purpose only. Access will only be granted after signing this contract. They are not permitted to further distribute the data and must delete it after completing their analyses.

Concern: How can I be sure the data are handled ethically?
Response: This study has been reviewed and approved by an ethics board to ensure that it meets ethical research standards. This includes evaluating potential risks, ensuring informed consent, and safeguarding participants’ rights and well-being.
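
In an online survey, the treatment could be implemented as a simple lookup from the selected concern to the text that is displayed. The sketch below is only an illustration: the short keys, the function `get_feedback`, and the abbreviated texts (ending in "...") are assumptions, not the actual survey implementation.

```r
# Hypothetical short keys for the concerns listed above; response texts are abbreviated
responses <- c(
  storage        = "Your personal data will be securely stored on protected servers ...",
  identification = "Whenever possible, your data will be anonymized ...",
  sharing        = "If you consent, your data may be stored in a specialized research data center ...",
  withdrawal     = "You can withdraw your consent at any time ...",
  misuse         = "Your data will only be used for research purposes ...",
  ethics         = "This study has been reviewed and approved by an ethics board ..."
)
neutral_text <- "Thank you for sharing your thoughts with us. ..."

# EG sees the response matching the voiced concern; CG sees the neutral text
get_feedback <- function(group, concern) {
  stopifnot(group %in% c("EG", "CG"), concern %in% names(responses))
  if (group == "EG") responses[[concern]] else neutral_text
}

get_feedback("EG", "withdrawal")
get_feedback("CG", "withdrawal")
```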

Study 2c

Effects of information on concerns

Structural model only (without measurement models)

Questions



General: Idea for a journal? -> registered report?

Study 1: How to ideally collect log-data from past online surveys -> ask in LLiB?

Study 2: How to collect participants? Panel provider probably biased

Study 2b: What if participants don’t read the fictitious example? -> ecological validity? internal validity?

Study 2c: What does CG do?

Feedback (Lab Report)

  • Concerns: Participants tend to bring rather diffuse concerns (“I am afraid that others will laugh at me”), whereas the proposed concerns are very specific to RDM
  • Our responses to the concerns
    • passive constructions, too RDM-oriented
    • too abstract
    • Are our responses worded to increase participation, or to be honest? (e.g., can the data leak? Not really, but potentially always yes)
    • More dialogic, personal, “bring out the people in our research” (there are people that collect your data…)
    • Responses in plain language?
  • Are we really interested in concerns (emotional level and response) or rather in misconceptions (informational level and response)?
  • Study 2c: Additional control variable: Experiences with studies
  • Ask not only about “willingness to participate” but also about “willingness to share data” -> equivalent to many informed consent formats, which ask about both separately

Thank you



Jürgen Schneider

References

Beck, C. T. (2019). Secondary Qualitative Data Analysis in the Health and Social Sciences (C. T. Beck, Ed.; 1st ed.). Routledge. https://doi.org/10.4324/9781315098753
Brysbaert, M. (2019). How many words do we read per minute? A review and meta-analysis of reading rate. Journal of Memory and Language, 109, 104047. https://doi.org/10.1016/j.jml.2019.104047
Campbell, R., Goodman-Williams, R., Engleton, J., Javorka, M., & Gregory, K. (2023). Open science and data sharing in trauma research: Developing a trauma-informed protocol for archiving sensitive qualitative data. Psychological Trauma: Theory, Research, Practice, and Policy, 15(5), 819–828. https://doi.org/10.1037/tra0001358
Campbell, R., Goodman-Williams, R., Javorka, M., Engleton, J., & Gregory, K. (2023). Understanding Sexual Assault Survivors’ Perspectives on Archiving Qualitative Data: Implications for Feminist Approaches to Open Science. Psychology of Women Quarterly, 47(1), 51–64. https://doi.org/10.1177/03616843221131546
Douglas, B. D., McGorray, E. L., & Ewell, P. J. (2021). Some researchers wear yellow pants, but even fewer participants read consent forms: Exploring and improving consent form reading in human subjects research. Psychological Methods, 26(1), 61–68. https://doi.org/10.1037/met0000267
Fichtner, U. A., Horstmeier, L. M., Brühmann, B. A., Watter, M., Binder, H., & Knaus, J. (2023). The role of data sharing in survey dropout: A study among scientists as respondents. Journal of Documentation, 79(4), 864–879. https://doi.org/10.1108/JD-06-2022-0135
Geier, C., Adams, R. B., Mitchell, K. M., & Holtz, B. E. (2021). Informed Consent for Online Research - Is Anybody Reading?: Assessing Comprehension and Individual Differences in Readings of Digital Consent Forms. Journal of Empirical Research on Human Research Ethics, 16(3), 154–164. https://doi.org/10.1177/15562646211020160
Ittenbach, R. F., Senft, E. C., Huang, G., Corsmo, J. J., & Sieber, J. E. (2015). Readability and Understanding of Informed Consent Among Participants With Low Incomes: A Preliminary Report. Journal of Empirical Research on Human Research Ethics, 10(5), 444–448. https://doi.org/10.1177/1556264615615006
Kamath, Y. V., Shetty, Y. C., Lanjewar, I. C., & Kulkarni, A. (2025). Readability of informed consent documents and its impact on consent refusal rate. Perspectives in Clinical Research, 16(1), 38–43. https://doi.org/10.4103/picr.picr_322_23
Kuula, A. (2011). Methodological and Ethical Dilemmas of Archiving Qualitative Data. IASSIST Quarterly, 34(3-4), 12. https://doi.org/10.29173/iq455
Lamb, D., Russell, A., Morant, N., & Stevenson, F. (2024). The challenges of open data sharing for qualitative researchers. Journal of Health Psychology, 29(7), 659–664. https://doi.org/10.1177/13591053241237620
Masson, M. E. J. (1983). Conceptual processing of text during skimming and rapid sequential reading. Memory & Cognition, 11(3), 262–274. https://doi.org/10.3758/BF03196973
Mozersky, J., Parsons, M., Walsh, H., Baldwin, K., McIntosh, T., & DuBois, J. M. (2020). Research Participant Views regarding Qualitative Data Sharing. Ethics & Human Research, 42(2), 13–27. https://doi.org/10.1002/eahr.500044
Parfenova, D., Niftulaeva, A., & Carr, C. T. (2024). Words, words, words: Participants do not read consent forms in communication research. Communication Research Reports, 1–11. https://doi.org/10.1080/08824096.2024.2379832
Pedersen, E. R., Neighbors, C., Tidwell, J., & Lostutter, T. W. (2011). Do Undergraduate Student Research Participants Read Psychological Research Consent Forms? Examining Memory Effects, Condition Effects, and Individual Differences. Ethics & Behavior, 21(4), 332–350. https://doi.org/10.1080/10508422.2011.585601
Perrault, E. K., & Keating, D. M. (2018). Seeking Ways to Inform the Uninformed: Improving the Informed Consent Process in Online Social Science Research. Journal of Empirical Research on Human Research Ethics, 13(1), 50–60. https://doi.org/10.1177/1556264617738846
Pietrzykowski, T., & Smilowska, K. (2021). The reality of informed consent: Empirical studies on patient comprehension—systematic review. Trials, 22(1), 57. https://doi.org/10.1186/s13063-020-04969-w
Steinhardt, I., Fischer, C., Heimstädt, M., Hirsbrunner, S. D., İkiz-Akıncı, D., Kressin, L., Kretzer, S., Möllenkamp, A., Porzelt, M., Rahal, R.-M., Schimmler, S., Wilke, R., & Wünsche, H. (2021). Opening up and Sharing Data from Qualitative Research: A Primer: Results of a workshop run by the research group „Digitalization and Science“ at the Weizenbaum Institute in Berlin on January 17, 2020. https://doi.org/10.34669/WI.WS/17
Stieglitz, S., Wilms, K., Mirbabaie, M., Hofeditz, L., Brenger, B., López, A., & Rehwald, S. (2020). When are researchers willing to share their data? – Impacts of values and uncertainty on open data in academia. PLOS ONE, 15(7), e0234172. https://doi.org/10.1371/journal.pone.0234172
Ulrich, C. M., Ratcliffe, S. J., Hochheimer, C. J., Zhou, Q., Huang, L., Gordon, T., Knafl, K., Richmond, T., Schapira, M. M., Miller, V., Mao, J. J., Naylor, M., & Grady, C. (2024). Informed Consent among Clinical Trial Participants with Different Cancer Diagnoses. AJOB Empirical Bioethics, 15(3), 165–177. https://doi.org/10.1080/23294515.2023.2262992
VandeVusse, A., Mueller, J., & Karcher, S. (2022). Qualitative Data Sharing: Participant Understanding, Motivation, and Consent. Qualitative Health Research, 32(1), 182–191. https://doi.org/10.1177/10497323211054058
Wisgalla, A., & Hasford, J. (2022). Four reasons why too many informed consents to clinical research are invalid: A critical analysis of current practices. BMJ Open, 12(3), e050543. https://doi.org/10.1136/bmjopen-2021-050543
Xu, A., Baysari, M. T., Stocker, S. L., Leow, L. J., Day, R. O., & Carland, J. E. (2020). Researchers’ views on, and experiences with, the requirement to obtain informed consent in research involving human participants: A qualitative study. BMC Medical Ethics, 21(1). https://doi.org/10.1186/s12910-020-00538-7
Zuiderwijk, A., Shinde, R., & Jeng, W. (2020). What drives and inhibits researchers to share and use open research data? A systematic literature review to analyze factors influencing open research data adoption. PLOS ONE, 15(9), e0239283. https://doi.org/10.1371/journal.pone.0239283

Credit

Title page: Aziz Acharki on Unsplash

Icons by Font Awesome CC BY 4.0